OAM: An Option-Action Reinforcement Learning Framework for Universal Multi-Intersection Control
Authors
Abstract
Efficient traffic signal control is an important means to alleviate urban congestion. Reinforcement learning (RL) has shown great potential in devising optimal signal plans that can adapt to dynamic traffic. However, several challenges still need to be overcome. First, a paradigm of state, action, and reward design is needed, especially for an optimality-guaranteed reward function. Second, the generalization of RL algorithms is hindered by the varied topologies and physical properties of intersections. Third, enhanced cooperation between intersections is needed for large-scale network applications. To address these issues, an Option-Action RL framework for universal Multi-intersection control (OAM) is proposed. Based on the well-known cell transmission model, we first define a lane-cell-level state to better model the traffic flow propagation. Building on these queuing dynamics, we propose a regularized delay as the reward to facilitate temporal credit assignment while maintaining equivalence with minimizing average travel time. We then recapitulate phase actions as constrained combinations of lane options in a neural network structure so as to realize any intersection phase definition. Multiple-intersection cooperation is rigorously discussed using potential game theory. We test the OAM algorithm on four networks with different settings, including a city-level scenario with 2,048 intersections, using both synthetic and real-world datasets. The results show that OAM outperforms state-of-the-art controllers in reducing average travel time.
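The abstract's lane-cell-level state and delay-based reward can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the paper's implementation: the function names, the cell/lane lengths, and the simple excess-queue proxy for the regularized delay are all illustrative assumptions based only on the cell transmission model idea of discretizing each lane into cells.

```python
import numpy as np

def lane_cell_state(vehicle_positions, lane_length=150.0, cell_length=25.0):
    """Hypothetical lane-cell-level state: bin vehicle positions
    (meters from the stop line) into per-cell vehicle counts, in the
    spirit of the cell transmission model."""
    n_cells = int(np.ceil(lane_length / cell_length))
    counts = np.zeros(n_cells, dtype=int)
    for pos in vehicle_positions:
        idx = min(int(pos // cell_length), n_cells - 1)
        counts[idx] += 1
    return counts

def delay_reward(counts, free_flow_count):
    """Illustrative delay-style reward: negative total excess of each
    cell's count over a free-flow baseline. A stand-in for the paper's
    regularized delay, not its actual definition."""
    return -float(np.maximum(counts - free_flow_count, 0).sum())

# Four vehicles on a 150 m lane split into six 25 m cells.
state = lane_cell_state([3.0, 10.0, 40.0, 120.0])
reward = delay_reward(state, free_flow_count=1)
```

Under this sketch the RL agent would observe one such count vector per lane and receive the summed delay reward each control step.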
Similar resources
Multi-Task Deep Reinforcement Learning for Continuous Action Control
In this paper, we propose a deep reinforcement learning algorithm to learn multiple tasks concurrently. A new network architecture is proposed in the algorithm which reduces the number of parameters needed by more than 75% per task compared to typical single-task deep reinforcement learning algorithms. The proposed algorithm and network fuse images with sensor data and were tested with up to 12...
A Laplacian Framework for Option Discovery in Reinforcement Learning
Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned re...
Reinforcement Learning for Multi-Linked Manipulator Control
We present an automatic trajectory planning and obstacle avoidance method for a multi-linked manipulator which uses position and velocity sensor information directly to produce the appropriate real-valued torques for each joint. The inputs are fed into a Cerebellar Model Arithmetic Computer (CMAC) [1] and in each state, the expected reward and torques for each joint are learnt through self-expe...
Exploring Multi-action Relationship in Reinforcement Learning
In many real-world reinforcement learning problems, an agent needs to control multiple actions simultaneously. To learn under this circumstance, each action was previously treated independently of the others. However, these multiple actions are rarely independent in applications, and it could be helpful to accelerate the learning if the underlying relationship among the actions is utiliz...
Action Selection Methods Using Reinforcement Learning
Action Selection schemes, when translated into precise algorithms, typically involve considerable design effort and tuning of parameters. Little work has been done on solving the problem using learning. This paper compares eight different methods of solving the action selection problem using Reinforcement Learning (learning from rewards). The methods range from centralised and cooperative to dece...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i4.20378